| 
 June 14, 2023
By Bob O'Donnell

One of the most indisputable beneficiaries of the generative AI phenomenon has been the GPU, a chip that made its initial mark on the world as a graphics accelerator for gaming. As it turns out, GPUs have proven to be extremely adept at enabling and improving the process of training large foundation models and running AI inferencing workloads as well.

Up until now, the big winner in the generative AI GPU game has been Nvidia, thanks to a combination of strong hardware and a large installed base of CUDA software tools. At an event in San Francisco this week, however, AMD came out with both new GPU and CPU hardware and important new software partnerships and updates. Taken together, AMD believes these announcements will help it take a bigger chunk of a datacenter AI accelerator market it predicted will reach $150 billion by 2027.

The new Instinct MI300X chip is what AMD referred to as a dedicated generative AI accelerator. Leveraging the same basic chiplet-based design as the previously announced Instinct MI300A (which the company also announced was now sampling), the MI300X replaces the 24 Zen4 CPU cores in the MI300A with additional CDNA3 GPU cores and High Bandwidth Memory (HBM). In fact, the new chip, which includes a total of 153 billion transistors, has 192 GB of HBM and offers 5.2 TB/second of memory bandwidth. These represent a 2.4x increase in memory capacity and a 1.6x improvement in throughput versus Nvidia’s current H100 accelerator. While those numbers are hard to fathom for almost any other application, large language models (LLMs) run most efficiently in memory, so these should translate to solid real-world performance when the chip starts sampling in the third quarter of this year.

In addition to hardware, AMD also made several important announcements on the software side. First, the company detailed the latest iteration of its ROCm platform for AI software development.
ROCm 5 consists of low-level libraries, compilers, development tools and a runtime that allows AI-related workloads to run natively on AMD’s Instinct line of GPU accelerators. It also serves as the base on which AI development frameworks such as PyTorch, TensorFlow and ONNX operate. Speaking of which, one of the two big bits of software news from AMD’s event was a new relationship with the PyTorch Foundation. Starting with PyTorch 2.0, any AI models or applications built with PyTorch will run natively on AMD Instinct accelerators that have been upgraded to support ROCm 5.4.2.

This is a huge deal because a large number of AI models are being built with PyTorch and, until this deal was announced, most could only run on Nvidia’s GPUs. Now model and application developers, as well as major cloud computing providers, will have the flexibility to either use AMD Instinct accelerators directly or even swap out Nvidia accelerators for AMD ones.

The other big software announcement was with Hugging Face, which has quickly become the de facto location for open-source AI models. As part of the new partnership, Hugging Face will work with AMD to ensure that thousands of existing and all new open-source models posted on its site will be made compatible with Instinct accelerators. Longer term, the companies plan to also work on compatibility across other AMD processors, including Epyc and Ryzen CPUs, Radeon GPUs, Alveo DPUs and Versal FPGAs (or adaptive processors, as AMD calls them). Once again, this is extremely important and should help position AMD as a much more viable alternative to Nvidia in a number of AI datacenter environments.

On the datacenter CPU front, AMD also announced new “Bergamo” and “Genoa X” versions of its fourth-generation Epyc processors. The company also hinted at yet another version called “Sienna” that it said would be announced later this year.
Bergamo is optimized for cloud computing workloads and uses a new, smaller Zen4c core, which lets AMD squeeze many more cores onto the chip (up to 128). The refined architecture allows it to do things such as run more containers simultaneously, resulting in impressive benchmarks that partners including AWS, Azure and Meta were all happy to discuss in person as part of the presentation. Genoa X pairs the company’s 3D V-Cache technology, first introduced with the third-generation “Milan” series, with the fourth-generation Genoa CPU design. It’s optimized for technical and high-performance computing (HPC) workloads that need access to more and faster on-die cache memories.

What’s interesting about all these CPU developments (as well as the variations on the Instinct MI300 accelerator side) is that they reflect AMD’s growing diversity of designs optimized for specific types of applications. The nice thing is they can all leverage a number of core AMD technologies, including the chiplet-based design and the Infinity Fabric interconnect technology that the company created as part of its first chiplet efforts. It’s a great example of how forward-looking designs can have a very large and long-lasting impact on overall strategies.

One last bit of AI hardware that AMD unveiled at the event was the Instinct Platform. Like a conceptually similar offering from Nvidia, the Instinct Platform combines eight of AMD’s GPU-based accelerators (MI300Xs) into a single, compact hardware design.

To be clear, Nvidia still has an enormous lead over virtually everyone else when it comes to generative AI training and a strong position in inferencing as well. Once these new accelerators and software partnerships start to make their presence known, however, they should make a meaningful impact for AMD. As much as companies like what Nvidia has enabled for generative AI, the truth is, no one likes to have a market dominated by a single player.
As a result, many companies will likely be eager to see AMD develop into a strong alternative in this space.

On the datacenter CPU front, AMD has clearly developed into a strong alternative to Intel, so it’s not hard to imagine the company starting to develop a similar profile in datacenter AI accelerators. It won’t be easy, but it’s certainly going to make things more interesting.

Here’s a link to the original article: https://seekingalpha.com/article/4611443-amd-generative-ai-vision

Bob O’Donnell is the president and chief analyst of TECHnalysis Research, LLC, a market research firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on LinkedIn at Bob O’Donnell or on Twitter @bobodtech. |